Part-of-Speech Tagging with Two Sequential Transducers
Abstract
The article presents a method of constructing and applying a cascade of two finite-state transducers, one left-sequential and one right-sequential, for part-of-speech (POS) disambiguation. In POS tagging, every word is first assigned a unique ambiguity class that represents the set of alternative tags the word can occur with. The sequence of ambiguity classes for all words in a sentence is then mapped by the first transducer to a sequence of reduced ambiguity classes, in which some of the less likely tags have been removed. That sequence is finally mapped by the second transducer to a sequence of single tags. Compared to a Hidden Markov Model tagger, this transducer cascade is significantly faster, at the cost of slightly lower accuracy. Applications such as information retrieval, where speed can matter more than accuracy, could benefit from this approach.
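The data flow described in the abstract can be illustrated with a small Python sketch. It is not the article's construction: plain functions stand in for compiled transducers, and the lexicon, the tag names, the `UNLIKELY` pair table, and all function names are hypothetical, hand-written for illustration. The sketch also assumes that the left-to-right pass runs first, as the order in the abstract suggests.

```python
# Toy lexicon (hypothetical): each word maps to exactly one ambiguity class,
# written as a tuple of tags in an arbitrary preference order.
LEXICON = {
    "they": ("PRP",),
    "can": ("NN", "MD", "VB"),
    "the": ("DET",),
    "fish": ("NN", "VB"),
}

# Hand-written (left_tag, right_tag) pairs treated as unlikely.
# Illustrative only; a real system would derive such knowledge from a corpus.
UNLIKELY = {("PRP", "NN"), ("DET", "MD"), ("DET", "VB"), ("MD", "DET")}


def ambiguity_classes(words):
    """Step 1: look up the ambiguity class of every word."""
    return [LEXICON[w.lower()] for w in words]


def reduce_left_to_right(classes):
    """Step 2: deterministic left-to-right pass over the classes.

    A tag is dropped when it is unlikely after every tag that survived
    in the previous word's reduced class. The state is that previous class.
    """
    reduced = []
    prev = ()  # no left context at the start of the sentence
    for cls in classes:
        kept = tuple(
            t for t in cls
            if not prev or any((p, t) not in UNLIKELY for p in prev)
        )
        kept = kept or cls  # never reduce a class to nothing
        reduced.append(kept)
        prev = kept
    return reduced


def select_right_to_left(reduced):
    """Step 3: deterministic right-to-left pass choosing one tag per word.

    The state is the tag already chosen for the word to the right.
    """
    tags = []
    nxt = None  # no right context at the end of the sentence
    for cls in reversed(reduced):
        choice = next((t for t in cls if (t, nxt) not in UNLIKELY), cls[0])
        tags.append(choice)
        nxt = choice
    tags.reverse()
    return tags


def tag(words):
    """Run the three steps in sequence."""
    return select_right_to_left(reduce_left_to_right(ambiguity_classes(words)))
```

With these toy tables, `tag("they can the fish".split())` returns `["PRP", "VB", "DET", "NN"]`: the first pass removes `NN` from "can" and `VB` from "fish" using left context, and the second pass prefers `VB` over `MD` for "can" because of the determiner to its right. Each pass does a constant amount of work per word, which is the kind of property that makes such a cascade fast.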
Related resources
Finite State Transducers Approximating Hidden Markov Models
This paper describes the conversion of a Hidden Markov Model into a sequential transducer that closely approximates the behavior of the stochastic model. This transformation is especially advantageous for part-of-speech tagging because the resulting transducer can be composed with other transducers that encode correction rules for the most frequent tagging errors. The speed of tagging is also i...
Use of Weighted Finite State Transducers in Part of Speech Tagging
This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination o...
Part-of-Speech Tagging Using Parallel Weighted Finite-State Transducers
We use parallel weighted finite-state transducers to implement a part-of-speech tagger, which obtains state-of-the-art accuracy when used to tag the Europarl corpora for Finnish, Swedish and English. Our system consists of a weighted lexicon and a guesser combined with a bigram model factored into two weighted transducers. We use both lemmas and tag sequences in the bigram model, which guarante...
Publication date: 2000